
simulating metabolism to understand how the metabolic signalling network works.

Suitable databases include PubMed, Gene Expression Omnibus (GEO) and GENEVESTIGATOR.

Example 1.3

1. Question: Answer B

2. Question: Answer A

3. Question: Answer D

Comment on the Answers

If you didn’t find the right answers, here is the corresponding protein sequence: https://www.ncbi.nlm.nih.gov/protein/AAX29205.1. To find it yourself, select Protein in the database menu next to the PubMed search bar and type HIV into the search bar; the results should include the entry “TAR, partial [synthetic construct], Accession: AAX29205.1”. This entry contains all the information needed for the answers.

Question 1.4

The BLAST (Basic Local Alignment Search Tool) algorithm allows protein and nucleotide sequences to be compared with a large database in terms of their local similarity. In this process, a query sequence is compared with the reference sequences in a database; the result can indicate, for example, which virus a patient has contracted. BLAST uses a heuristic search and the two-hit method: a list of short words from the query (the so-called lookup table) is first compared with the short word lists of the database (indexed database). If at least one matching short word is found in an entry, the algorithm immediately checks whether there is another short word hit in the vicinity (within a fixed distance), and only then calculates the alignment. In all other cases, the algorithm blasts ahead to the next database entry.
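The two-hit idea described above can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the actual BLAST implementation: only exact word matches are counted (real BLAST also scores similar "neighbourhood" words), "in the vicinity" is interpreted as a second, non-overlapping hit on the same diagonal within a fixed window, and the function names and the parameters `w` (word length) and `window` are hypothetical choices for illustration.

```python
from collections import defaultdict


def build_lookup(query, w):
    """Map every word of length w in the query to its start positions."""
    table = defaultdict(list)
    for i in range(len(query) - w + 1):
        table[query[i:i + w]].append(i)
    return table


def two_hit_seeds(query, subject, w=3, window=10):
    """Return (query_pos, subject_pos) of each second hit that would
    trigger an alignment in this simplified two-hit scheme.

    Assumption: a second hit counts only if it lies on the same diagonal
    (subject_pos - query_pos) as an earlier hit, does not overlap it,
    and starts at most `window` positions after it.
    """
    lookup = build_lookup(query, w)
    last_hit = {}  # diagonal -> subject position of the most recent hit
    seeds = []
    for j in range(len(subject) - w + 1):
        word = subject[j:j + w]
        for i in lookup.get(word, ()):
            diag = j - i
            prev = last_hit.get(diag)
            if prev is not None and j - prev < w:
                # Overlaps the previous hit: ignore it, keep the earlier one.
                continue
            if prev is not None and j - prev <= window:
                # Second non-overlapping hit close by: worth aligning here.
                seeds.append((i, j))
            last_hit[diag] = j
    return seeds
```

Calling `two_hit_seeds("MKTAYIAKQR", "XXMKTAYIAKQRXX")` returns the positions at which a second word hit was found; a database entry with only a single word hit, such as `"XXMKTXX"`, yields an empty list and would be skipped without computing an alignment.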

With a BLAST search, one can thus identify homologous genes and compare them position by position, both to identify unknown sequences and to find corresponding differences in other organisms (e.g. for the development of an animal model). However, sequence analysis can be taken much further bioinformatically. For example, the patient’s virus can be compared with other patient isolates, related viruses (HIV-1, HIV-2, etc.) and other sequences. In the clinic, incidentally, HIV isolates are now routinely sequenced to detect resistance mutations, so that changes in the virus population under antiretroviral therapy can be recognised in good time and the therapy adapted and optimised accordingly. For further information, please use the link to BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi).

Question 1.5

So, in your own program, you would first read in the sequence (input part), then use an

algorithm (“two-hit method”) to calculate the similarity to the entries in the database

20  Solutions to the Exercises